# Self-Distillation Training
## SeerAttention/QwQ-32B-AttnGates

An attention-gating (AttnGates) weight adapter for the QwQ-32B model that accelerates long-context computation through dynamic block-level sparsity.

License: Apache-2.0 · Tags: Large Language Model, Transformers · Publisher: SeerAttention · Downloads: 35 · Likes: 3
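As a rough illustration of what dynamic block-level sparsity means here, the toy sketch below scores key blocks against pooled query blocks and keeps only the top-scoring fraction. This is not SeerAttention's AttnGates mechanism (which uses learned gates trained by self-distillation from full attention); the function name, block size, and the mean-pooling heuristic are all illustrative assumptions.

```python
import torch

def block_sparse_mask(q: torch.Tensor, k: torch.Tensor,
                      block: int = 64, keep_ratio: float = 0.25) -> torch.Tensor:
    """Toy block-level gate: True where a (query-block, key-block)
    pair should be computed, False where it can be skipped."""
    n_blocks = q.shape[0] // block          # assumes seq_len % block == 0
    qb = q[: n_blocks * block].view(n_blocks, block, -1).mean(dim=1)
    kb = k[: n_blocks * block].view(n_blocks, block, -1).mean(dim=1)
    scores = qb @ kb.T                      # (n_blocks, n_blocks) block relevance
    k_keep = max(1, int(keep_ratio * n_blocks))
    top = scores.topk(k_keep, dim=-1).indices
    mask = torch.zeros(n_blocks, n_blocks, dtype=torch.bool)
    mask.scatter_(1, top, True)             # keep only top-scoring key blocks
    return mask

q, k = torch.randn(256, 128), torch.randn(256, 128)
print(block_sparse_mask(q, k))              # 4x4 boolean block map
```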
## naver/splade-cocondenser-selfdistil

A SPLADE model for passage retrieval that improves retrieval effectiveness through sparse lexical document expansion and knowledge distillation.

Tags: Text Embedding, Transformers, English · Publisher: naver · Downloads: 16.11k · Likes: 10
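For context on how such a sparse representation is produced at inference time, the sketch below follows the standard SPLADE recipe: masked-LM logits are passed through a log-saturated ReLU, masked to real tokens, and max-pooled over positions, yielding a vocabulary-sized sparse vector; query-document relevance is the dot product. The Hub id matches the listing above, but treat this as a minimal sketch rather than naver's reference implementation.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "naver/splade-cocondenser-selfdistil"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
model.eval()

def splade_vector(text: str) -> torch.Tensor:
    """Return a vocabulary-sized sparse representation of `text`."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits             # (1, seq_len, vocab_size)
    weights = torch.log1p(torch.relu(logits))       # log-saturated ReLU
    weights = weights * inputs["attention_mask"].unsqueeze(-1)
    return weights.max(dim=1).values.squeeze(0)     # max-pool -> (vocab_size,)

q = splade_vector("what causes aurora borealis")
d = splade_vector("The northern lights are caused by solar particles.")
print(float(q @ d))  # dot-product relevance score
```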
## cambridgeltl/trans-encoder-bi-simcse-roberta-large

An unsupervised sentence encoder based on RoBERTa-large, trained with self-distillation and mutual distillation, suited to sentence-similarity tasks.

Tags: Text Embedding, Transformers · Publisher: cambridgeltl · Downloads: 17 · Likes: 0
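A minimal way to use a bi-encoder like this for sentence similarity is sketched below: encode each sentence, pool, and compare with cosine similarity. [CLS]-token pooling is assumed here because the model follows the SimCSE setup; check the model card for the exact pooling used.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "cambridgeltl/trans-encoder-bi-simcse-roberta-large"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

def embed(sentence: str) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[:, 0]  # [CLS] pooling, assumed as in SimCSE

a = embed("A man is playing a guitar.")
b = embed("Someone is playing an instrument.")
print(torch.nn.functional.cosine_similarity(a, b).item())
```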